A Graph Matching Method for Historical Census Household Linkage

Authors

  • Zhichun Fu
  • Peter Christen
  • Jun Zhou
Abstract

Linking historical census data across time is challenging for several reasons, including data quality issues, limited information about individuals, and changes to households over time. Although most census data linking methods link records that correspond to individual household members, recent advances show that linking households as a whole provides more accurate results and fewer multiple household links. In this paper, we introduce a graph-based method to link households, which takes the structural relationships between household members into consideration. Based on individual record linking results, our method builds a graph for each household, so that matches are determined by both attribute-level and record-relationship similarity. Our experimental results on both synthetic and real historical census data validate the effectiveness of this method. The proposed method achieves an F-measure of 0.937 on data extracted from real UK census datasets, outperforming all alternative methods compared.
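The idea of scoring a household match by both attribute-level and record-relationship similarity can be illustrated with a minimal Python sketch. This is not the authors' implementation: the `Person` fields, the weights, the 5-year tolerance, the use of age difference as the only edge attribute, and the mixing parameter `alpha` are all assumptions chosen for illustration. Each household is treated as a graph whose vertices are members and whose edges carry the age difference between two members; the links between the two households are assumed to come from a prior individual record linking step.

```python
from dataclasses import dataclass
from difflib import SequenceMatcher
from itertools import combinations
from typing import List, Tuple


@dataclass(frozen=True)
class Person:
    """One census record (hypothetical, simplified fields)."""
    first_name: str
    surname: str
    age: int


def attribute_similarity(a: Person, b: Person, years_between: int = 10) -> float:
    """Vertex-level score: name string similarity plus age consistency."""
    name_sim = SequenceMatcher(
        None,
        f"{a.first_name} {a.surname}".lower(),
        f"{b.first_name} {b.surname}".lower(),
    ).ratio()
    # A person should be about `years_between` years older in the later census.
    age_gap = abs((b.age - a.age) - years_between)
    age_sim = max(0.0, 1.0 - age_gap / 5.0)  # assumed 5-year tolerance
    return 0.7 * name_sim + 0.3 * age_sim    # assumed weights


def edge_similarity(a1: Person, a2: Person, b1: Person, b2: Person) -> float:
    """Edge-level score: the age difference between two members should persist."""
    drift = abs((a1.age - a2.age) - (b1.age - b2.age))
    return max(0.0, 1.0 - drift / 5.0)


def household_similarity(old: List[Person], new: List[Person],
                         links: List[Tuple[int, int]], alpha: float = 0.5) -> float:
    """Combine vertex and edge scores over the linked members.

    `links` holds (index in old, index in new) pairs produced by an
    earlier individual record linking step.
    """
    if not links:
        return 0.0
    vertex = sum(attribute_similarity(old[i], new[j]) for i, j in links) / len(links)
    pairs = list(combinations(links, 2))
    if not pairs:
        # A single linked member gives no edges, so only the vertex score is used.
        return vertex
    edge = sum(
        edge_similarity(old[i1], old[i2], new[j1], new[j2])
        for (i1, j1), (i2, j2) in pairs
    ) / len(pairs)
    return alpha * vertex + (1.0 - alpha) * edge


# Made-up example: one household in an earlier census, two candidates ten years later.
earlier = [Person("John", "Smith", 40), Person("Mary", "Smith", 38), Person("Tom", "Smith", 10)]
same_household = [Person("John", "Smith", 50), Person("Mary", "Smith", 48), Person("Thomas", "Smith", 20)]
other_household = [Person("John", "Smith", 50), Person("Ann", "Smith", 30), Person("Tom", "Smith", 5)]

score_same = household_similarity(earlier, same_household, [(0, 0), (1, 1), (2, 2)])
score_other = household_similarity(earlier, other_household, [(0, 0), (2, 2)])
print(score_same, score_other)
```

In this made-up example, both candidates contain a "John Smith" of the right age, so attribute similarity alone leaves some ambiguity. The edge term penalises the second candidate because the age difference between John and Tom is not preserved, which is the kind of structural evidence the abstract refers to.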

Similar resources

Temporal group linkage and evolution analysis for census data

The temporal linkage of census data allows the detailed analysis of population-related changes in an area of interest. It should not only link records about the same person but also support the linkage of groups of related persons such as households. In this paper, we thus propose a new approach to both temporal record and group (household) linkage for census data and study its application for ...

A Supervised Learning and Group Linking Method for Historical Census Household Linkage

Historical census data provide a snapshot of the era when our ancestors lived. Such data contain valuable information that allows the reconstruction of households and the tracking of family changes across time, allows the analysis of family diseases, and facilitates a variety of social science research. One particular topic of interest in historical census data analysis is households and linki...

Multiple Instance Learning for Group Record Linkage

Record linkage is the process of identifying records that refer to the same entities from different data sources. While most research efforts are concerned with linking individual records, new approaches have recently been proposed to link groups of records across databases. Group record linkage aims to determine if two groups of records in two databases refer to the same entity or not. One app...

Impact of Small-Holders’ Cattle Fattening on Household Income Generation in Fadis District of Eastern Hararghe Zone, Oromia, Ethiopia

At the household level, livestock plays a critical economic and social role in pastoralist and smallholder farm households. The objectives of this study were to analyze factors affecting participation in cattle fattening and its impacts on household income in Fadis district of Eastern Hararghe. Both...

An unconstrained statistical matching algorithm for combining individual and household level geo-specific census and survey data

The Population Census is an important source of statistical information in most countries that is capable of producing reliable estimates of population characteristics for small geographic areas. One limitation of a census is that there are many population characteristics that cannot be collected due to respondent burden or cost. This means that statistical agencies have to conduct population b...

Publication date: 2014